Towards a Theoretical Analysis of PCA for Heteroscedastic Data
Principal Component Analysis (PCA) is a method for estimating a subspace
given noisy samples. It is useful in a variety of problems ranging from
dimensionality reduction to anomaly detection and the visualization of high
dimensional data. PCA performs well in the presence of moderate noise and even
with missing data, but is also sensitive to outliers. PCA is also known to have
a phase transition when noise is independent and identically distributed;
recovery of the subspace sharply declines at a threshold noise variance.
Effective use of PCA requires a rigorous understanding of these behaviors. This
paper provides a step towards an analysis of PCA for samples with
heteroscedastic noise, that is, samples that have non-uniform noise variances
and so are no longer identically distributed. In particular, we provide a
simple asymptotic prediction of the recovery of a one-dimensional subspace from
noisy heteroscedastic samples. The prediction enables: a) easy and efficient
calculation of the asymptotic performance, and b) qualitative reasoning to
understand how PCA is impacted by heteroscedasticity (such as outliers).
Comment: Presented at 54th Annual Allerton Conference on Communication,
Control, and Computing (Allerton
Optimally Weighted PCA for High-Dimensional Heteroscedastic Data
Modern applications increasingly involve high-dimensional and heterogeneous
data, e.g., datasets formed by combining numerous measurements from myriad
sources. Principal Component Analysis (PCA) is a classical method for reducing
dimensionality by projecting such data onto a low-dimensional subspace
capturing most of their variation, but PCA does not robustly recover underlying
subspaces in the presence of heteroscedastic noise. Specifically, PCA suffers
from treating all data samples as if they are equally informative. This paper
analyzes a weighted variant of PCA that accounts for heteroscedasticity by
giving samples with larger noise variance less influence. The analysis provides
expressions for the asymptotic recovery of underlying low-dimensional
components from samples with heteroscedastic noise in the high-dimensional
regime, i.e., for sample dimension on the order of the number of samples.
Surprisingly, it turns out that whitening the noise by using inverse noise
variance weights is suboptimal. We derive optimal weights, characterize the
performance of weighted PCA, and consider the problem of optimally collecting
samples under budget constraints.
Comment: 52 pages, 13 figures
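The weighting idea in the abstract above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's method: it assumes a rank-one signal, two groups of samples with known noise standard deviations, and inverse noise variance weights, which the abstract itself describes as suboptimal. The optimal weights are not stated in the abstract, so they are not reproduced here. The function names `weighted_pca` and `simulate` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def weighted_pca(Y, weights, k):
    """Return the top-k eigenvectors of the weighted sample covariance
    (1/n) * sum_i w_i * y_i y_i^T, where y_i are the columns of Y."""
    Y = np.asarray(Y, dtype=float)        # shape (d, n): one sample per column
    w = np.asarray(weights, dtype=float)  # shape (n,): one weight per sample
    cov = (Y * w) @ Y.T / Y.shape[1]      # broadcasting scales column i by w_i
    _, eigvecs = np.linalg.eigh(cov)      # eigenvalues come back in ascending order
    return eigvecs[:, ::-1][:, :k]        # reorder so the largest come first

def simulate(seed=0, d=50, n1=200, n2=800, sigma1=0.2, sigma2=3.0):
    """Draw rank-one data with two noise levels (all values are assumptions)."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)                # true unit-norm component
    sigmas = np.concatenate([np.full(n1, sigma1), np.full(n2, sigma2)])
    z = rng.standard_normal(n1 + n2)      # per-sample coefficients
    noise = rng.standard_normal((d, n1 + n2)) * sigmas
    return u, np.outer(u, z) + noise, sigmas

u, Y, sigmas = simulate()

# Unweighted PCA treats every sample as equally informative.
u_pca = weighted_pca(Y, np.ones_like(sigmas), 1)[:, 0]
# Inverse noise variance weights give noisier samples less influence.
u_w = weighted_pca(Y, 1.0 / sigmas**2, 1)[:, 0]

# Recovery is the squared inner product with the true component (1 is perfect).
rec_pca = float((u @ u_pca) ** 2)
rec_w = float((u @ u_w) ** 2)
print(f"unweighted recovery: {rec_pca:.3f}, weighted recovery: {rec_w:.3f}")
```

In this setup the noisy group dominates the unweighted covariance, so down-weighting it should improve recovery of the component. The sketch only shows the mechanism of weighting; it makes no claim about which weights are best.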
Convolutional Analysis Operator Learning: Dependence on Training Data
Convolutional analysis operator learning (CAOL) enables the unsupervised
training of (hierarchical) convolutional sparsifying operators or autoencoders
from large datasets. One can use many training images for CAOL, but a precise
understanding of the impact of doing so has remained an open question. This
paper presents a series of results that lend insight into the impact of dataset
size on the filter update in CAOL. The first result is a general deterministic
bound on errors in the estimated filters, and is followed by a bound on the
expected errors as the number of training samples increases. The second result
provides a high probability analogue. The bounds depend on properties of the
training data, and we investigate their empirical values with real data. Taken
together, these results provide evidence for the potential benefit of using
more training data in CAOL.
Comment: 5 pages, 2 figures
Kinematic Analysis of Prey Capture in Coastal Giant Salamanders (Dicamptodon tenebrosus)
Salamanders use a variety of techniques to capture prey that involve a combination of lingual and jaw prehension. For example, some plethodontid salamanders often use ballistic tongue projection to capture prey. Salamanders of the family Dicamptodontidae are the largest terrestrial salamanders in the world and feed on a diverse array of prey items (arthropods, annelids, small mammals, and reptiles). The objectives of our study were to describe and quantify the feeding behavior of terrestrial adult coastal giant salamanders (D. tenebrosus). While much research has been conducted on aquatic-phase D. tenebrosus, little is known about their terrestrial counterparts. Feeding bouts on three distinct prey types (crickets, earthworms, and slugs) were recorded using high-speed video (420-1000 frames/second) with a Casio Exilim EX-ZR100 digital camera. For each feeding trial, salamanders were offered a prey item with forceps. Trials were repeated on separate days, with each salamander (N=12) exposed to equal ratios of prey items. Videos were analyzed for velocity of initial strike, lingual projection, lower and upper jaw prehension, and feeding success. Non-metric multidimensional scaling analysis indicated significant differences in feeding patterns among prey types. Lingual prehension was the predominant method of ingestion when a small prey item (crickets) was offered, whereas the upper and lower mandibles were used in a snapping motion with larger prey items (earthworms). Future work will incorporate different prey items and examine prey preference and foraging behaviors of D. tenebrosus. Additionally, comparative analyses of the mechanics of prey capture in amphibian taxa will be conducted using the tiger salamander (Ambystoma tigrinum) and the tailed frog (Ascaphus truei).
Streaming Probabilistic PCA for Missing Data with Heteroscedastic Noise
Streaming principal component analysis (PCA) is an integral tool in
large-scale machine learning for rapidly estimating low-dimensional subspaces
of very high dimensional and high arrival-rate data with missing entries and
corrupting noise. However, modern trends increasingly combine data from a
variety of sources, meaning they may exhibit heterogeneous quality across
samples. Since standard streaming PCA algorithms do not account for non-uniform
noise, their subspace estimates can quickly degrade. On the other hand, the
recently proposed Heteroscedastic Probabilistic PCA Technique (HePPCAT)
addresses this heterogeneity, but it was not designed to handle missing entries
and streaming data, nor does it adapt to non-stationary behavior in time series
data. This paper proposes the Streaming HeteroscedASTic Algorithm for PCA
(SHASTA-PCA) to bridge this divide. SHASTA-PCA employs a stochastic alternating
expectation maximization approach that jointly learns the low-rank latent
factors and the unknown noise variances from streaming data that may have
missing entries and heteroscedastic noise, all while maintaining a low memory
and computational footprint. Numerical experiments validate the superior
subspace estimation of our method compared to state-of-the-art streaming PCA
algorithms in the heteroscedastic setting. Finally, we illustrate SHASTA-PCA
applied to highly heterogeneous real data from astronomy.
Comment: 19 pages, 6 figures
CometChip: A High-throughput 96-Well Platform for Measuring DNA Damage in Microarrayed Human Cells
DNA damaging agents can promote aging, disease and cancer; they are ubiquitous in the environment and are produced within human cells as normal cellular metabolites. Ironically, at high doses DNA damaging agents are also used to treat cancer. The ability to quantify DNA damage responses is thus critical in the public health, pharmaceutical and clinical domains. Here, we describe a novel platform that exploits microfabrication techniques to pattern cells in a fixed microarray. The ‘CometChip’ is based upon the well-established single cell gel electrophoresis assay (a.k.a. the comet assay), which estimates the level of DNA damage by evaluating the extent of DNA migration through a matrix in an electrical field. The types of damage measured by this assay include abasic sites, crosslinks, and strand breaks. Instead of being randomly dispersed in agarose as in the traditional assay, cells are captured into an agarose microwell array by gravity. The platform also expands from the size of a standard microscope slide to a 96-well format, enabling parallel processing. Here we describe protocols for using the chip to evaluate DNA damage caused by known genotoxic agents and the cellular repair response that follows exposure. Through the integration of biological and engineering principles, this method potentiates robust and sensitive measurements of DNA damage in human cells and provides the necessary throughput for genotoxicity testing, drug development, epidemiological studies and clinical assays.
Funding: National Institute of Environmental Health Sciences (Training Grant in Environmental Toxicology T32-ES007020); Massachusetts Institute of Technology. Center for Environmental Health Sciences (P30-ES002109); National Institute of Environmental Health Sciences (5-UO1-ES016045); National Institute of Environmental Health Sciences (1-R21-ES019498); National Institute of Environmental Health Sciences (R44-ES021116)
Is There a Patient Profile That Characterizes a Patient With Adult Spinal Deformity as a Candidate for Minimally Invasive Surgery?
Study design: Retrospective review. Objectives: The goal of this study was to evaluate the baseline characteristics of patients chosen to undergo traditional open versus minimally invasive surgery (MIS) for adult spinal deformity (ASD). Methods: A multicenter review of 2 databases including ASD patients treated with surgery. Inclusion criteria were age >45 years, Cobb angle minimum of 20°, and minimum 2-year follow-up. Preoperative radiographic parameters and disability outcome measures were reviewed. Results: A total of 350 patients were identified: 173 OPEN patients and 177 MIS. OPEN patients were significantly younger than MIS patients (61.5 years vs 63.74 years, P = .013). The OPEN group had significantly more females (87% vs 76%, P = .006), but both groups had similar body mass index. Preoperative lumbar Cobb was significantly higher for the OPEN group (34.2°) than for the MIS group (26.0°, P < .001). The mean preoperative Oswestry Disability Index was significantly higher in the MIS group (44.8 in OPEN patients and 49.8 in MIS patients, P < .011). The preoperative Numerical Rating Scale value for back pain was 7.2 in the OPEN group and 6.8 in the MIS group (P = .100). Conclusions: Patients chosen for MIS for ASD are slightly older and have smaller coronal deformities than those chosen for open techniques, but they did not have a substantially lesser degree of sagittal malalignment. MIS surgery was most frequently utilized for patients with a sagittal vertical axis under 6 cm and a baseline pelvic incidence and lumbar lordosis mismatch under 30°.
Reconstruction of 3D Whole-Body PET Data Using Blurred Anatomical Labels
The diagnostic utility of whole-body PET is often limited by the high level of statistical noise in the images. An improvement in image quality can be obtained by incorporating correlated anatomical information during the reconstruction of the PET data. The combined PET/CT (SMART) scanner allows the acquisition of accurately aligned PET and CT whole-body data. The authors present results of incorporating aligned anatomical information from the CT during the reconstruction of 3D whole-body PET data. They use the FORE+PWLS method for the reconstruction and a label model to incorporate anatomical information via penalty weights. Since in practice mismatches between anatomical and functional data are unavoidable, the labels are “blurred” to reflect the uncertainty associated with the anatomical information. Results show the potential advantage of incorporating anatomical information by using blurred labels with the penalty weights.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/85864/1/Fessler153.pd
Treatment of the Fractional Curve of Adult Scoliosis With Circumferential Minimally Invasive Surgery Versus Traditional, Open Surgery: An Analysis of Surgical Outcomes.
Study Design: Retrospective, multicenter review of adult scoliosis patients with minimum 2-year follow-up. Objective: Because the fractional curve (FC) of adult scoliosis can cause radiculopathy, we evaluated patients treated with either circumferential minimally invasive surgery (cMIS) or open surgery. Methods: A multicenter retrospective adult deformity review was performed. Inclusion criteria were age >18 years with FC >10°, ≥3 levels of instrumentation, 2-year follow-up, and one of the following: coronal Cobb angle (CCA) > 20°, pelvic incidence and lumbar lordosis (PI-LL) > 10°, pelvic tilt (PT) > 20°, or sagittal vertical axis (SVA) > 5 cm. Results: The FC was treated in 118 patients, 79 open and 39 cMIS. The FCs had similar coronal Cobb angles preoperatively (17° cMIS, 19.6° open) and postoperatively (7° cMIS, 8.1° open), but open had more levels treated (12.1 vs 5.7). cMIS patients had greater reduction in VAS leg (6.4 to 1.8) than open (4.3 to 2.5). After propensity matching of 40 patients for levels treated (cMIS: 6.6 levels, N = 20; open: 7.3 levels, N = 20), both groups had similar FC correction (18° in both preoperatively; 6.9° in cMIS and 8.5° in open postoperatively). Open had more posterior decompressions (80% vs 22.2%, P < .001). Both groups had similar preoperative (Visual Analogue Scale [VAS] leg 6.1 cMIS and 5.4 open) and postoperative (VAS leg 1.6 cMIS and 3.1 open) leg pain. All cMIS patients had interbody grafts; 35% of open did. There was no difference in change of primary CCA, PI-LL, LL, Oswestry Disability Index, or VAS Back. Conclusion: Patients' FCs treated with cMIS had comparable reduction of leg pain compared with those treated with open surgery, despite significantly fewer cMIS patients undergoing direct decompression.